26/06/2018

Welcome, and overview

First session:

  • Not heaps of paleo-specific R
  • But building blocks to make you an expeRt
  • Things that go into R (data inputs)
  • How to structure your data inputs and outputs
  • Getting started in R

Welcome, and overview

Second session:

  • Data validation
  • Data visualisation
  • NMDS
  • RDA
  • Plotting NMDS etc. for publications
  • Saving and export

BD (before data): project structures

  • Personal choice… BUT
  • Keep these separate:
    1. Raw data (as entered)
    2. Corrected and modified data
  • Keep a record of how you went from (1) to (2) - even if you don’t do it in R
    1. Correct/modify data in R (with reminders)
    2. Create a new spreadsheet and keep a .txt record
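
A minimal sketch of option 1, assuming a made-up project layout: the folder names (data-raw, data-clean), the file names and the tiny example table are all hypothetical, and everything is written to a temporary folder so nothing real is touched.

```r
# Hypothetical project folder inside tempdir()
project   <- file.path(tempdir(), "demo-project")
raw_dir   <- file.path(project, "data-raw")    # (1) raw data, as entered
clean_dir <- file.path(project, "data-clean")  # (2) corrected and modified data
dir.create(raw_dir,   recursive = TRUE, showWarnings = FALSE)
dir.create(clean_dir, recursive = TRUE, showWarnings = FALSE)

# Made-up raw data with a typo in one taxon name
raw <- data.frame(depth_cm = c(10, 20, 30),
                  taxon    = c("Nothofagus", "Nothofagus", "Nothofagis"),
                  count    = c(12, 8, 5),
                  stringsAsFactors = FALSE)
raw_file <- file.path(raw_dir, "counts.csv")
write.csv(raw, raw_file, row.names = FALSE)

# Read the raw file; it is never overwritten
dat <- read.csv(raw_file, stringsAsFactors = FALSE)

# Correction 1: fix the misspelt taxon name (this comment is the "reminder")
dat$taxon[dat$taxon == "Nothofagis"] <- "Nothofagus"

# Save the corrected copy in a separate folder
clean_file <- file.path(clean_dir, "counts-corrected.csv")
write.csv(dat, clean_file, row.names = FALSE)

# Plain-text record of how we went from (1) to (2)
log_file <- file.path(clean_dir, "corrections.txt")
writeLines(c("counts.csv -> counts-corrected.csv",
             "1. taxon 'Nothofagis' corrected to 'Nothofagus'"),
           log_file)
```

The raw file still contains the typo afterwards; only the copy in data-clean is changed.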

Project structures

Where should a project live?

Pros and cons of the following:

  • MW-LCR shared drives
  • Dropbox (C:/)
  • GitHub (C:/)
  • MW-LCR personal drive

Where should a project live?

[Screenshot: GitHub]

Within the project: naming files

Machine readable

  • no punctuation symbols
  • no spaces
  • be careful with capitals
  • for data, easy to parse
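
These rules can be checked in R. The sketch below is one possible check, not a standard: the function name is_machine_readable() and the exact pattern (letters, digits, "_" and "-", then a single extension) are assumptions for illustration.

```r
# TRUE for names made only of letters, digits, "_" and "-",
# followed by one file extension; the pattern is an assumption, adjust to taste
is_machine_readable <- function(filename) {
  grepl("^[A-Za-z0-9_-]+\\.[A-Za-z0-9]+$", filename)
}

# One tidy name and one with spaces and brackets
is_machine_readable(c("eweburn_age-depth.csv",
                      "Eweburn age depth (final).csv"))
# returns TRUE FALSE
```

Capitals are allowed here because core IDs often contain them; being "careful with capitals" is left to the person naming the file.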

Machine readable

  • e.g. date_site_coreNUM_type
  • e.g. 2018-06-30_eweburn_X18-062_concentrations.csv
  • e.g. 2018-06-30_eweburn_X18-062_age-depth.csv
  • e.g. 2018-06-30_eweburn_X18-062_species-dictionary.csv

Note that we separate units of metadata with "_" and words within a unit with "-".
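
Why bother? Because names built this way can be pulled apart in R. A minimal sketch in base R; the file names are illustrative (dated to the workshop day) and the column names date, site, core and type are assumptions.

```r
# Illustrative file names following the date_site_core_type pattern
files <- c("2018-06-26_eweburn_X18-062_concentrations.csv",
           "2018-06-26_eweburn_X18-062_age-depth.csv",
           "2018-06-26_eweburn_X18-062_species-dictionary.csv")

# Drop the extension, then split on "_" to get one piece per unit of metadata
stems  <- tools::file_path_sans_ext(files)
pieces <- strsplit(stems, "_", fixed = TRUE)

# Bind the pieces into a data frame, one row per file
meta <- as.data.frame(do.call(rbind, pieces), stringsAsFactors = FALSE)
names(meta) <- c("date", "site", "core", "type")
meta$date <- as.Date(meta$date)   # works because the date is written yyyy-mm-dd
meta$file <- files
meta
```

The "-" inside a unit (X18-062, age-depth) survives the split, which is the point of using two different separators.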

Human readable

  • This applies to scripts and data

  • e.g. 1_data-cleaning-vegetation.R
  • e.g. 2_data-cleaning-species-dictionary.R
  • e.g. function_plot-all-species.R
  • e.g. function_clean-italics-tilia.R
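
A side benefit of numbered script names is that R can run them in order. The sketch below assumes a throwaway folder in tempdir() and three tiny made-up scripts whose contents are placeholders; only the file naming follows the examples above.

```r
# Throwaway folder with three placeholder scripts (contents are made up)
script_dir <- file.path(tempdir(), "demo-scripts")
dir.create(script_dir, showWarnings = FALSE)
writeLines("step_one <- 'cleaned vegetation'",
           file.path(script_dir, "1_data-cleaning-vegetation.R"))
writeLines("step_two <- paste(step_one, '+ species dictionary')",
           file.path(script_dir, "2_data-cleaning-species-dictionary.R"))
writeLines("plot_all_species <- function() invisible(NULL)",
           file.path(script_dir, "function_plot-all-species.R"))

# Load helper functions first, then the numbered steps in sorted order
function_files <- list.files(script_dir, pattern = "^function_.*\\.R$",
                             full.names = TRUE)
step_files <- sort(list.files(script_dir, pattern = "^[0-9]+_.*\\.R$",
                              full.names = TRUE))
for (f in c(function_files, step_files)) source(f)

step_two
```

Sorting is alphabetical, so once there are ten or more scripts the numbers need zero-padding (01_, 02_, ...).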

Group discussion - data & spreadsheets

[Screenshot: example data]

Booting up the R

[Screenshot: RStudio]

The basics (1)

getwd()
## [1] "/Users/Liv/Documents/paleo-R-workshop/1-folders-spreadsheets-organisingData"

The basics (2)

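# setwd() needs a path; this one is the folder reported by getwd() on the previous slide
setwd("/Users/Liv/Documents/paleo-R-workshop/1-folders-spreadsheets-organisingData")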
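
A small sketch of how setwd() behaves, using tempdir() as a stand-in for a real project folder. The file name data/counts.csv is hypothetical and is not created here.

```r
# setwd() needs a path and invisibly returns the previous working directory
old_wd <- setwd(tempdir())   # tempdir() stands in for a real project folder
getwd()

# Build paths relative to the working directory rather than typing them in full
data_file <- file.path("data", "counts.csv")   # hypothetical file
data_file

# Go back to where we started
setwd(old_wd)
```

Keeping the old directory in old_wd makes it easy to undo the change at the end of a script.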

The basics (3)

sessionInfo()
## R version 3.4.3 (2017-11-30)
## Platform: x86_64-apple-darwin15.6.0 (64-bit)
## Running under: macOS High Sierra 10.13.5
## 
## Matrix products: default
## BLAS: /Library/Frameworks/R.framework/Versions/3.4/Resources/lib/libRblas.0.dylib
## LAPACK: /Library/Frameworks/R.framework/Versions/3.4/Resources/lib/libRlapack.dylib
## 
## locale:
## [1] en_NZ.UTF-8/en_NZ.UTF-8/en_NZ.UTF-8/C/en_NZ.UTF-8/en_NZ.UTF-8
## 
## attached base packages:
## [1] stats     graphics  grDevices utils     datasets  methods   base     
## 
## other attached packages:
## [1] DiagrammeR_0.9.2
## 
## loaded via a namespace (and not attached):
##  [1] Rcpp_0.12.16       highr_0.6          pillar_1.2.1      
##  [4] compiler_3.4.3     RColorBrewer_1.1-2 influenceR_0.1.0  
##  [7] plyr_1.8.4         bindr_0.1          viridis_0.4.0     
## [10] tools_3.4.3        digest_0.6.14      jsonlite_1.5      
## [13] viridisLite_0.2.0  gtable_0.2.0       evaluate_0.10.1   
## [16] tibble_1.4.2       rgexf_0.15.3       pkgconfig_2.0.1   
## [19] rlang_0.2.0        igraph_1.1.2       rstudioapi_0.7    
## [22] yaml_2.1.18        bindrcpp_0.2       gridExtra_2.3     
## [25] downloader_0.4     dplyr_0.7.4        stringr_1.3.0     
## [28] knitr_1.18         htmlwidgets_1.2    hms_0.4.2         
## [31] grid_3.4.3         rprojroot_1.2      glue_1.2.0        
## [34] R6_2.2.2           Rook_1.1-1         XML_3.98-1.9      
## [37] rmarkdown_1.9      ggplot2_2.2.1      purrr_0.2.4       
## [40] readr_1.1.1        tidyr_0.8.0        magrittr_1.5      
## [43] backports_1.1.1    scales_0.5.0       htmltools_0.3.6   
## [46] assertthat_0.2.0   colorspace_1.3-2   brew_1.0-6        
## [49] stringi_1.1.7      visNetwork_2.0.3   lazyeval_0.2.1    
## [52] munsell_0.4.3
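
This output is worth keeping next to your results, in the same spirit as the .txt record of data corrections. A minimal sketch, assuming a hypothetical file name (session-info.txt) and writing to tempdir():

```r
# Capture what sessionInfo() prints and write it to a plain-text file
info_file <- file.path(tempdir(), "session-info.txt")   # hypothetical file name
writeLines(capture.output(sessionInfo()), info_file)

# Read the first few lines back to check it worked
readLines(info_file, n = 3)
```

In a real project the file would sit in the project folder rather than in tempdir().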

The basics (4)

require(tidyverse)
## Loading required package: tidyverse
## ── Attaching packages ──────────────────────────────────────────────────────── tidyverse 1.2.1 ──
## ✔ ggplot2 2.2.1     ✔ purrr   0.2.4
## ✔ tibble  1.4.2     ✔ dplyr   0.7.4
## ✔ tidyr   0.8.0     ✔ stringr 1.3.0
## ✔ readr   1.1.1     ✔ forcats 0.3.0
## ── Conflicts ─────────────────────────────────────────────────────────── tidyverse_conflicts() ──
## ✖ dplyr::filter() masks stats::filter()
## ✖ dplyr::lag()    masks stats::lag()
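
Two things in this output are worth a quick sketch: what require() does when a package is missing, and how to reach a masked function. The package name below is deliberately fake and exists only for this demonstration.

```r
# A deliberately non-existent package name, used only for this demonstration
fake_pkg <- "notARealPackage12345"

# require() warns and returns FALSE when the package is missing
loaded <- suppressWarnings(
  require(fake_pkg, character.only = TRUE, quietly = TRUE)
)
loaded

# library() raises an error instead, which stops a script early
lib_result <- tryCatch({
  library(fake_pkg, character.only = TRUE)
  "loaded"
}, error = function(e) "error")
lib_result

# Masked functions stay reachable with the package prefix:
# this is explicitly the stats version, whatever dplyr masks
stats::filter(1:5, rep(1/3, 3))
```

Because library() fails loudly, many people prefer it in scripts; require() suits cases where you want to test the result yourself.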

Example data

# install.packages("vegan")
# install.packages("skimr")
require(vegan)
require(skimr)
data("mite")
data("mite.env")

Viewing data

head(mite, n = 6)
##   Brachy PHTH HPAV RARD SSTR Protopl MEGR MPRO TVIE HMIN HMIN2 NPRA
## 1     17    5    5    3    2       1    4    2    2    1     4    1
## 2      2    7   16    0    6       0    4    2    0    0     1    3
## 3      4    3    1    1    2       0    3    0    0    0     6    3
## 4     23    7   10    2    2       0    4    0    1    2    10    0
## 5      5    8   13    9    0      13    0    0    0    3    14    3
## 6     19    7    5    9    3       2    3    0    0   20    16    2
##   TVEL ONOV SUCT LCIL Oribatl1 Ceratoz1 PWIL Galumna1 Stgncrs2 HRUF Trhypch1 PPEL
## 1   17    4    9   50        3        1    1        8        0    0        0    0
## 2   21   27   12  138        6        0    1        3        9    1        1    1
## 3   20   17   10   89        3        0    2        1        8    0        3    0
## 4   18   47   17  108       10        1    0        1        2    1        2    1
## 5   32   43   27    5        1        0    5        2        1    0        1    0
## 6   13   38   39    3        5        0    1        1        8    0        4    0
##   NCOR SLAT FSET Lepidzts Eupelops Miniglmn LRUG PLAG2 Ceratoz3 Oppiminu Trimalc2
## 1    0    0    0        0        0        0    0     0        0        0        0
## 2    2    2    2        1        0        0    0     0        0        0        0
## 3    2    0    8        0        0        0    0     0        0        0        0
## 4    3    2   12        0        0        0    0     0        0        0        0
## 5    0    0   12        2        0        0    0     0        0        0        0
## 6    1    0   10        0        0        0    0     0        0        0        0
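
head() is only one way to look at a data set. A few base R alternatives are sketched below; vegan is assumed to be installed, and the guard simply skips the example if it is not.

```r
# vegan is assumed to be installed; the guard skips the example otherwise
if (requireNamespace("vegan", quietly = TRUE)) {
  data("mite", package = "vegan")

  dim(mite)          # number of rows (samples) and columns (taxa)
  str(mite[, 1:5])   # column types for the first five taxa
  anyNA(mite)        # are there any missing values?

  # Five most abundant taxa across all samples
  head(sort(colSums(mite), decreasing = TRUE), 5)
}
```

None of these need extra packages beyond the one that supplies the data.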

Viewing data

skim(mite)
## Skim summary statistics
##  n obs: 70 
##  n variables: 35 
## 
## ── Variable type:integer ────────────────────────────────────────────────────────────────────────
##  variable missing complete  n  mean    sd p0  p25  p50   p75 p100     hist
##    Brachy       0       70 70  8.73 10.08  0 3     4.5 11.75   42 ▇▂▁▂▁▁▁▁
##  Ceratoz1       0       70 70  1.29  1.46  0 0     1    2       5 ▇▆▁▃▁▁▁▁
##  Ceratoz3       0       70 70  1.3   2.2   0 0     0    2       9 ▇▁▁▁▁▁▁▁
##  Eupelops       0       70 70  0.64  0.99  0 0     0    1       4 ▇▃▁▁▁▁▁▁
##      FSET       0       70 70  1.86  3.18  0 0     0    2      12 ▇▂▁▁▁▁▁▁
##  Galumna1       0       70 70  0.96  1.73  0 0     0    1       8 ▇▁▁▁▁▁▁▁
##      HMIN       0       70 70  4.91  8.47  0 0     0    4.75   36 ▇▁▁▁▁▁▁▁
##     HMIN2       0       70 70  1.96  3.92  0 0     0    2.75   20 ▇▂▁▁▁▁▁▁
##      HPAV       0       70 70  8.51  7.56  0 4     6.5 12      37 ▇▇▃▃▁▁▁▁
##      HRUF       0       70 70  0.23  0.62  0 0     0    0       3 ▇▁▁▁▁▁▁▁
##      LCIL       0       70 70 35.26 88.85  0 1.25 13   44     723 ▇▁▁▁▁▁▁▁
##  Lepidzts       0       70 70  0.17  0.54  0 0     0    0       3 ▇▁▁▁▁▁▁▁
##      LRUG       0       70 70 10.43 12.66  0 0     4.5 17.75   57 ▇▂▂▁▁▁▁▁
##      MEGR       0       70 70  2.19  3.62  0 0     1    3      17 ▇▂▁▁▁▁▁▁
##  Miniglmn       0       70 70  0.24  0.79  0 0     0    0       5 ▇▁▁▁▁▁▁▁
##      MPRO       0       70 70  0.16  0.47  0 0     0    0       2 ▇▁▁▁▁▁▁▁
##      NCOR       0       70 70  1.13  1.65  0 0     0.5  1.75    7 ▇▃▂▂▁▁▁▁
##      NPRA       0       70 70  1.89  2.37  0 0     1    2.75   10 ▇▂▂▁▁▁▁▁
##      ONOV       0       70 70 17.27 18.05  0 5    10.5 24.25   73 ▇▃▂▁▁▁▁▁
##  Oppiminu       0       70 70  1.11  1.84  0 0     0    1.75    9 ▇▁▁▁▁▁▁▁
##  Oribatl1       0       70 70  1.89  3.43  0 0     0    2.75   17 ▇▁▁▁▁▁▁▁
##      PHTH       0       70 70  1.27  2.17  0 0     0    2       8 ▇▁▁▁▁▁▁▁
##     PLAG2       0       70 70  0.8   1.79  0 0     0    1       9 ▇▁▁▁▁▁▁▁
##      PPEL       0       70 70  0.17  0.54  0 0     0    0       3 ▇▁▁▁▁▁▁▁
##   Protopl       0       70 70  0.37  1.61  0 0     0    0      13 ▇▁▁▁▁▁▁▁
##      PWIL       0       70 70  1.09  1.71  0 0     0    1       8 ▇▁▁▁▁▁▁▁
##      RARD       0       70 70  1.21  2.78  0 0     0    1      13 ▇▂▁▁▁▁▁▁
##      SLAT       0       70 70  0.4   1.23  0 0     0    0       8 ▇▁▁▁▁▁▁▁
##      SSTR       0       70 70  0.31  0.97  0 0     0    0       6 ▇▁▁▁▁▁▁▁
##  Stgncrs2       0       70 70  0.73  1.83  0 0     0    0       9 ▇▁▁▁▁▁▁▁
##      SUCT       0       70 70 16.96 13.89  0 7.25 13.5 24      63 ▇▇▆▅▂▁▁▁
##  Trhypch1       0       70 70  2.61  6.14  0 0     0    2      29 ▇▁▁▁▁▁▁▁
##  Trimalc2       0       70 70  2.07  5.79  0 0     0    0      33 ▇▁▁▁▁▁▁▁
##      TVEL       0       70 70  9.06 10.93  0 0     3   19      42 ▇▁▁▂▁▁▁▁
##      TVIE       0       70 70  0.83  1.47  0 0     0    1       7 ▇▁▁▁▁▁▁▁